The COVID-19 pandemic continues to heavily impact the United States, which has surpassed 200,000 deaths since cases were first recorded. To shed light on the severity of the pandemic, the “Provisional COVID-19 Death Count by Sex, Age, and State” data set (https://data.cdc.gov/resource/9bhg-hcku.json) was taken from the Centers for Disease Control and Prevention (CDC) and analyzed. The data include the number of COVID-19 deaths between February 2020 and August 2020 that were reported to the National Center for Health Statistics (NCHS), by sex and age group. The data set also includes the number of deaths in which pneumonia, often a complication of severe COVID-19, was diagnosed alongside COVID-19. The data are incomplete because of the time it takes for a death certificate to be completed and submitted to the NCHS after death. Although location by state was also available, only 20 states were represented in this data set, which was too little information for a proper state-level analysis.
The main purpose of this project is to analyze:
1) The effect of age and gender on the COVID-19 mortality rate.
2) The frequency of pneumonia in COVID-19 patients, and its effect on patient mortality.
The data set, “Provisional COVID-19 Death Count by Sex, Age, and State” (https://data.cdc.gov/resource/9bhg-hcku.json), was accessed from the Centers for Disease Control and Prevention website through an API. Once downloaded, the desired information was extracted with regular expressions and assembled into a data table. The key independent variables examined in this study were age and gender, while the data of interest were the number of deaths from COVID-19, the number of deaths from pneumonia, and the number of deaths in which both COVID-19 and pneumonia were involved.
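As a rough illustration of the extraction step, the Python sketch below pulls fields out of raw JSON text with regular expressions and assembles them into rows. It is not the code used for this report: the field names (`state`, `sex`, `age_group`, `covid_19_deaths`) and the sample records are assumptions made for illustration only.

```python
import re

# Hypothetical raw text shaped like a JSON API response (values are made up).
RAW = (
    '[{"state":"United States","sex":"Male","age_group":"5-14 years","covid_19_deaths":"3"},'
    '{"state":"United States","sex":"Female","covid_19_deaths":"7"}]'
)

def extract_field(record, field):
    """Return the quoted value of `field` in one raw record, or None if absent."""
    match = re.search(r'"' + re.escape(field) + r'":"([^"]*)"', record)
    return match.group(1) if match else None

def build_table(raw, fields):
    """Split the raw text into {...} records and extract each requested field."""
    records = re.findall(r'\{[^{}]*\}', raw)
    return [{f: extract_field(r, f) for f in fields} for r in records]

table = build_table(RAW, ["state", "sex", "age_group", "covid_19_deaths"])
```

A record that lacks a field comes back with `None` for that field, which makes observations with missing values easy to find and drop later.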
The age variable was separated into age groups, with ranges such as 0-17, 15-24, and 18-29. When the age group was extracted from the raw data, some observations did not include any age information; those observations were removed.
The data set also contained overlapping age ranges. To prevent double-counting of deaths, the overlapping age ranges were removed. The final age groups ran from age 5 to age 84 in increments of 10 years (5-14, 15-24, 25-34, etc.).
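A minimal sketch of this filtering step, written in Python rather than the tooling used for the report. The row layout and the sample counts are hypothetical; the kept labels follow the age groups that appear in the tables of this report.

```python
# Non-overlapping 10-year age groups retained for the analysis.
KEPT_AGE_GROUPS = [
    "5-14 years", "15-24 years", "25-34 years", "35-44 years",
    "45-54 years", "55-64 years", "65-74 years", "75-84 years",
]

def clean_age_rows(rows):
    """Drop rows whose age group is missing or outside the kept set."""
    kept = set(KEPT_AGE_GROUPS)
    return [row for row in rows if row.get("age_group") in kept]

# Hypothetical rows: one missing age, two overlapping ranges, two kept ranges.
rows = [
    {"age_group": None, "covid_deaths": 4},
    {"age_group": "0-17 years", "covid_deaths": 10},
    {"age_group": "5-14 years", "covid_deaths": 6},
    {"age_group": "18-29 years", "covid_deaths": 30},
    {"age_group": "15-24 years", "covid_deaths": 12},
]
cleaned = clean_age_rows(rows)
```

Filtering against an explicit list of kept labels, rather than listing the ranges to drop, means any unexpected or newly added range is excluded by default.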
To check the accuracy of the data cleaning process, the COVID-19 deaths in the individual age groups were summed and compared with the total number of COVID-19 deaths reported across all age groups.
The number of deaths due to COVID-19 was separated by gender. Some observations listed the gender as “All Genders.” Because the goal of this project is to determine the influence of gender on COVID-19 mortality, these observations were removed, which also prevents double-counting. Observations in which the gender was unknown were removed as well.
To check the accuracy of the data, the COVID-19 deaths for each gender were summed and compared with the total number of COVID-19 deaths reported across all genders.
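The consistency checks described above can be sketched as follows. This is an illustrative Python example with made-up counts, not the report's own code; it also shows why leaving an aggregate “All Genders” row in place would double-count deaths.

```python
def totals_match(group_counts, reported_total):
    """Return True when the per-group counts add up to the reported total."""
    return sum(group_counts.values()) == reported_total

# Hypothetical counts for illustration only.
reported_total = 100
by_gender = {"Male": 60, "Female": 40}

# Keeping the aggregate row counts every death twice.
with_aggregate = {**by_gender, "All Genders": 100}

check_clean = totals_match(by_gender, reported_total)
check_raw = totals_match(with_aggregate, reported_total)
```

The same function works for the age-group check by passing the per-age-group counts instead.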
The extracted state variable (the first word of each state name) contained 1,000 entries: 64 for "United", 49 each for "Alabama" through "Louisiana" in alphabetical order (including "District"), and 5 for "Maine".
The numbers of COVID-19 deaths, pneumonia deaths, and deaths involving both COVID-19 and pneumonia were all organized by age group. The same age ranges were used as in the initial analysis of age group and COVID-19 mortality. Observations in which the gender was not known, as well as overlapping age categories, were excluded from this analysis. The resulting numbers may be incomplete because of missing data and the lag in reporting deaths for all three conditions.
All of the preliminary tables and figures were made with knitr and ggplot2.
The following table and figure show the relationship between age group and COVID-19 deaths:
| Age_Group | Covid_Deaths |
|---|---|
| 5-14 years | 39 |
| 15-24 years | 400 |
| 25-34 years | 1677 |
| 35-44 years | 4333 |
| 45-54 years | 11474 |
| 55-64 years | 27569 |
| 65-74 years | 46987 |
| 75-84 years | 57865 |
The table above lists the number of COVID-19 deaths for each age group in the United States from February to August 2020. The numbers range from 39 deaths among those 5-14 years old to 57,865 deaths among those 75-84 years old.
Figure 1 illustrates the number of COVID-19 deaths by age group in the United States from February to August 2020. The 5-14 age group was omitted from the figure because its number of COVID-19 deaths was far lower than that of the other age groups. There were 39 COVID-19 deaths in the 5-14 age group, less than 0.03% of total COVID-19 deaths.
According to Figure 1, the number of deaths due to COVID-19 increased with each successive age group: the older the age group, the greater the number of COVID-19 deaths. The increase was particularly pronounced after the age of 55, as each age group from 55-64 onward recorded more than 10,000 additional deaths compared with the group before it.
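One way to examine how the counts grow from one age group to the next is to compute the absolute increase and the ratio between adjacent groups. The Python sketch below uses the counts from the age-group table above; the variable names are illustrative and this is not the code used to produce the figure.

```python
# COVID-19 deaths by age group, copied from the table above.
covid_deaths = {
    "5-14 years": 39,
    "15-24 years": 400,
    "25-34 years": 1677,
    "35-44 years": 4333,
    "45-54 years": 11474,
    "55-64 years": 27569,
    "65-74 years": 46987,
    "75-84 years": 57865,
}

groups = list(covid_deaths)

# Absolute increase and ratio relative to the preceding age group.
increase = {}
ratio = {}
for previous, current in zip(groups, groups[1:]):
    increase[current] = covid_deaths[current] - covid_deaths[previous]
    ratio[current] = covid_deaths[current] / covid_deaths[previous]

for group in groups[1:]:
    print(f"{group}: +{increase[group]} deaths, x{ratio[group]:.2f}")
```

Looking at both measures separates two questions: in which groups the most additional deaths occur (the absolute increase) and whether the counts multiply by a steady factor from group to group (the ratio).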
The following table shows the difference in COVID-19 deaths by gender in the United States from February 2020 to August 2020.
| Gender | Covid Death |
|---|---|
| Female | 82692 |
| Male | 125357 |
The number of males who have died due to COVID-19 is 125,357, while the number of females who have died due to COVID-19 is 82,692. These counts sum to a different total than the age-group counts because different observations were omitted in each analysis, depending on which values were unknown or duplicated for that variable.
Figure 2 illustrates the difference in the number of COVID-19 deaths in the United States by gender from February to August 2020. Cases in which the gender was unknown were removed from this figure. According to the data, more males than females have died due to COVID-19. The ratio of male to female COVID-19 deaths is 1.516, meaning that the number of males who have died from COVID-19 is 1.516 times the number of females who have died.
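The male-to-female comparison can be reproduced from the counts in the gender table above. The short Python sketch below is illustrative only (the variable names are not from the original analysis); it computes the ratio and each gender's share of the deaths with a known gender.

```python
# COVID-19 deaths by gender, copied from the table above.
gender_deaths = {"Female": 82692, "Male": 125357}

# Ratio of male deaths to female deaths.
male_to_female = gender_deaths["Male"] / gender_deaths["Female"]

# Share of each gender among deaths with a known gender, in percent.
total = sum(gender_deaths.values())
share = {gender: 100 * count / total for gender, count in gender_deaths.items()}

print(f"male-to-female ratio: {male_to_female:.3f}")
for gender, pct in share.items():
    print(f"{gender}: {pct:.1f}% of deaths")
```

Because these are raw counts, the ratio compares numbers of deaths only; turning it into a comparison of mortality rates would also require the male and female population sizes, which are not part of this data set.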
Figure 3 illustrates the number of COVID-19 deaths by age group and gender in the United States from February to August 2020. The gender distribution of COVID-19 deaths within each age group mirrors the overall pattern: in each age group, the number of male deaths from COVID-19 is greater than the number of female deaths.
The following table shows the number of deaths for each condition (COVID-19, pneumonia, and both COVID-19 and pneumonia) by age group.
| Age_Group | Covid_Deaths | Pneumonia_Deaths | Covid_and_Pneumonia_Deaths |
|---|---|---|---|
| 5-14 years | 39 | 228 | 14 |
| 15-24 years | 400 | 1040 | 280 |
| 25-34 years | 1677 | 3736 | 1484 |
| 35-44 years | 4333 | 8368 | 3842 |
| 45-54 years | 11474 | 21876 | 10816 |
| 55-64 years | 27569 | 58568 | 27106 |
| 65-74 years | 46987 | 99622 | 46058 |
| 75-84 years | 57865 | 122076 | 54378 |
As expected, the number of deaths for each condition increased with age. The number of COVID-19 deaths recorded does not include deaths in which both COVID-19 and pneumonia were found.
Figure 4 illustrates the number of deaths in which both COVID-19 and pneumonia were involved. The number of deaths involving both COVID-19 and pneumonia increases with age and reaches its highest value among those 75-84 years of age.
Figure 5 illustrates the percentage of deaths in which both COVID-19 and pneumonia were present, relative to the total number of deaths due to COVID-19. For most age groups, deaths in which both COVID-19 and pneumonia were present account for around 40-50% of total COVID-19 deaths, and the percentage stays relatively consistent across age groups.
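The percentages can be approximated from the table above. The Python sketch below rests on an assumption: following the earlier statement that the COVID-19 count excludes deaths involving both conditions, it takes the total for each age group as Covid_Deaths plus Covid_and_Pneumonia_Deaths. This is one reading of how the figure was built, not a confirmed description of the original calculation.

```python
# (Covid_Deaths, Covid_and_Pneumonia_Deaths) by age group, from the table above.
deaths = {
    "5-14 years": (39, 14),
    "15-24 years": (400, 280),
    "25-34 years": (1677, 1484),
    "35-44 years": (4333, 3842),
    "45-54 years": (11474, 10816),
    "55-64 years": (27569, 27106),
    "65-74 years": (46987, 46058),
    "75-84 years": (57865, 54378),
}

def pneumonia_share(covid_only, covid_and_pneumonia):
    """Percent of COVID-19 deaths that also involved pneumonia.

    Assumes the total is the sum of the two counts (see the note above).
    """
    return 100 * covid_and_pneumonia / (covid_only + covid_and_pneumonia)

share = {group: pneumonia_share(*counts) for group, counts in deaths.items()}

for group, pct in share.items():
    print(f"{group}: {pct:.1f}%")
```

If the COVID-19 column instead already included the deaths involving pneumonia, the denominator would be Covid_Deaths alone and the resulting percentages would be considerably higher, so the definition of the total matters for interpreting the figure.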
In conclusion, the number of deaths due to COVID-19 is influenced by age and gender. The number of deaths due to COVID-19 increases with age, with the most deaths occurring among individuals between the ages of 75 and 84. There were more COVID-19 deaths among males than among females. In addition, pneumonia was present in around 50% of the deaths due to COVID-19, and that percentage stayed relatively similar across the age groups. Although pneumonia is found in around half of COVID-19 deaths, there is not enough information in this data set to determine whether a co-diagnosis of pneumonia led to an increase in the mortality rate from COVID-19.